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We present a stochastic model for the size of a taxon in paleobiology, in which we allow for the 
evolution of new taxon members, and both individual and catastrophic extinction events. The model 
uses ideas from the theory of birth and death processes. Some general properties of the model are 
developed, and a fuller discussion is given for specific distributions of the time between catastrophic 
extinction events. Long tails in the taxon size distribution arise naturally from the model. 

PACS numbers: 87.23 



Random processes leading to probability distributions 
with slowly-decaying algebraic tails have been of consid- 
erable recent interest in physics, although they have been 
sporadically studied in other areas for many years. As 
probability laws with Pr{A > x} oc x~ a as x — > oo 
are naturally associated with scaling properties, self- 
similarity and fractals, it is tempting to propose fractal 
underlying mechanisms to explain the occurrence of these 
laws in particular contexts in nature. 

We address a problem of interest in paleobiology, where 
long-tailed distributions arise, and propose a model that 
requires no explicitly fractal underlying mechanism to ex- 
plain these distributions. The model predicts the distri- 
bution of the number of elements in a taxon, for example 
the number of species in a genus, and our analysis cov- 
ers both the distribution of species currently in existence, 
and the distribution of all species in the genus that have 
ever existed, some of which may now be extinct. 

Several other problems in taxonomy and genetics in- 
volve similar mathematical analysis, and the authors 
have briefly addressed elsewhere y} ^] the fitting of mod- 
els of this type to real biological data. More generally, the 
basic modelling approach in this paper involves killing 
at an exponentially distributed time a stochastic process 
for which the mean grows exponentially in time. The ap- 
pearance of power-law distributions in such contexts has 
been discussed in general by the authors elsewhere j|, 
and has applications in social phenomena [Q and other 
contexts not covered in the present paper. 

Modelling of taxon size has been of sporadic interest in 
the literature for some time || g, [7J , and there has been a 
steady accumulation of relevant data on both extinct and 
surviving taxonomic groups j|, ||. We shall not address 
mechanisms for species interaction in an ecosystem that 
may drive extinction or species proliferation, nor shall we 
address the shape of evolutionary trees (cladograms). 

In our model, a taxon comes into existence with a sin- 
gle representative species, genus or family at time t = 0. 
The number of members of the taxon grows with time as 
species mutate. Some species produce many new species, 
but any species may also become extinct, and a natural 
model for the proliferation and extinction of taxon mem- 



bers is the linear birth-death process (see, for example, 
]l0| , pp. 165-167 and pp. 265-266). A taxon member 
(referred to hereafter as a species) has in the time inter- 
val (t, t + h) a probability \h + o(h) of 'giving birth' to a 
new species, and a probability \xh + o(h) of 'dying'. As- 
suming independence of speciation and extinction events, 
if there are n species present at time t, the probability 
of one speciation occurring in (t, t + h) is Xnh + o(h), 
and the probability of one extinction is \inh + o{K) . Let 
M t denote the number of species that have ever existed 
(whether currently alive or not) at time t and N t the 
number of species currently alive at time t, and write 
PmA 1 ) = Pr{-Mt = m, N t = n}. Then for h > 0, 

p m ,n(t + h) = [l- n(X + fi)h + o(h)}p m!n (t) 

+ [Xh + o(h)](n - l)jJ TO -i jn _i(i) 

+ [fih + o(h)](n + l)jj m , n+1 (t) + o(h). 

The first term on the right-hand side corresponds to no 
change in M t or Nt in the time interval (t, t + h), the 
second to one birth, and the third to one death. All 
other events have probability o(h). Subtracting p m .n(t) 
from both sides, dividing by h and letting h — -> 0, we 
deduce the differential-difference equation 

-nPm,n(t) = -(X + (l,)np mn (t) + \(n-l)p m ^i }n -i(t) 

at 

+fj,(n + l)p m ,n+l{t)- 

This equation is valid for all integer m and n provided 
that we adopt the convention that p m , n (t) — if m < 
or n < 0. We measure time from the appearance of 
the first individual, so that p m ,n(0) = 1 if to = n = 1; 
Pm,n(0) = otherwise. Introducing the generating func- 
tion P(£,C,*) = E{£ M *C iVt } (where E denotes expecta- 
tion) one may readily derive JTT[ the partial differential 
equation P t = {£C 2 ^ ~ + MX + P( with initial con- 
dition P(£, C, 0) = £C Using the method of characteris- 
tics one finds iflltl that 



z 2 (C - z^e^ + Zl {z 2 - Qe A ^ 2t 
(C - ^i)e A « zlt + (z 2 - C)e A ? Z2 * 



(1) 
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where z\ and z 2 denote the roots of the quadratic equa- 
tion £Az 2 — (A + fi)z + fi = 0. Since the probability that 
M t species have ever existed up to time t is given by 
Pr{M t = m} = Y^=iPmn, setting ( = 1 in Eq. ([[]) we 
recover the generating function for the probability that 
Mt species have ever existed up to time t: 



E{£ Mt } 



A[(A£ - xi)e x ^ + (x 2 - \£)e X2i 



For brevity we have let x% — A^Zi, so that x\ and x 2 are 
the roots of x 2 — (A + p)x + Xp£ = 0. These roots are 
distinct for all p and A when £ < 1. The roots become A 
and p when £ = 1, so that the case A = p is degenerate 
for £ = 1, but this presents no difficulties in the subse- 
quent analysis. Using the results that x\ + x 2 = A + fi 
and X\x 2 = A/i£, establishes the form of the generating 
function needed below: 



l)[e (a:2t -e Xlt ] 
(X2 ~ A0e E2 * - (xi - A^e* 1 ' 



(2) 



To make contact with better known results on the num- 
ber of species currently alive, we note that setting £ = 1 
gives z\ = 1 and Z2 = £*/A and Eq. (Q) becomes Q 



- C) - - AOe-'^-^ 



if A ^ /x, 



E{c w t}= J A(l-C)-(M-AC)e- i ( A ^) 

{ i-^-O/Il + Ata-O]" 1 if A = fx. 

That (N t ) = e^ x ~^ 1 follows by differentiation, while ex- 
pansion of the generating function gives (|10|, p. 166) 



Pr{jV t = 0} = (^-^e-*( A -^)/(A-^e-*( A -wjTforn > 1 



Pr{7V t = n} 



{X- pfer 1 ^^ ( \- Xer^-ri 



[A - fie-^-^} 2 \A-^e-*( A -^) 



n-l 



In the limiting case A = (1, Pr{N t = 0} = Xt/(1 + Xt) 
and Pr{AT t = n} = {Xt) n - l /{l + Xt) n+1 for n > 1. When 
/i = (that is, there is a pure birth process) the solution 
reduces to that found by Yule || in his model of species 
evolution under a speciation rate A. 

There is considerable evidence for major catastrophic 
extinctions occurring within a relatively short period, 
these extinctions having been attributed to various 
causes, including major meteorite impacts [T^ ] and a hy- 
pothesised purely biotic mechanism called coevolution- 
ary avalanches |Q. To include catastrophic extinctions 
in our model, we require the probability density function 
f(t) for the time T between the start of a taxon and 
the next catastrophe. In the analysis below we carry a 
general f(t) as far as possible. The three specific models 
discussed here are proposed with some diffidence, though 
each has a certain natural appeal, and each may apply 
to appropriate subsets of paleological data. The common 
thread to all three models is that as t — > oo, 



with 1 /0 the mean time between catastrophic extinction 
events and either q = Q ov q = — 1. The small-< behavior 
is different in the three models, but this difference does 
not affect the dominant asymptotic behavior of the taxon 
size distribution. Using Eq. (j|) with some flexibility as 
to the value of q seems a reasonable approach. 

(a) The pure exponential model. As a first model one 
may assume that f(t) — 8e~ et for t > 0. This asserts 
that the waiting time for the next catastrophe is expo- 
nentially distributed, but effectively considers only one 
taxon: no account is taken of the fact that in a long time 
interval, many taxons should be initiated, while in a short 
time interval, it is likely that no taxons will be initiated. 
Subtle conditional probability effects are ignored. 

For models (b) and (c) below, we assume that catas- 
trophic extinction events occur in a Poisson process at 
rate 9, while taxon initiations occur in a Poisson process 
at rate p. Thus the probability density function for the 
waiting time between extinctions is ip(t) — 9e~ 0t , t > 0, 
while the waiting-time density for the start of the next 
taxon is x(t) = pe~ pt , t > 0. 

(b) The first new taxon model. Consider the time to 
the next catastrope for the first taxon initiated after the 
previous catastrophe. If we condition on the time T be- 
tween catastrophes, the conditional waiting-time density 
for appearance of a taxon is pe~ pt / (1 — e~ pT ), < t < T. 
The time between the appearance of the taxon and the 
next catastrophe therefore has the probability density 
function J7(t|T) = pe p ^~ T ^>/{l - e^), < t < T. We 
now average over T to deduce for the time from taxon 
commencement to the next catastrophe the density 



f(t)= / 0e-%(t|r)dr 
Jo 



e - P T-0T dT 

1 - e~P T 



»ln[l/(^)] as t -> 0, while 



It can be shown that f(t) 
f(t) ~ [p9/(p + 9)] e~ et as t -> oo. 

(c) Uniform taxon nucleation between catastrophes. 
The probability that there is at least one taxon initiated 
in the time interval of duration t between two succes- 
sive catastrophes is 1 — e~ pT . It is known |1| that for a 
Poisson process with rate p, conditional on there being n 
occurrences in a time interval of length r, the occurrence 
times have the same distribution as the order statistics of 
a set of n independent times, each uniformly distributed 
on the interval of length r. This suggests as a model for 
the probability density function for the time to the next 
catastrophe 



/(*) 



P Jt 



[1-e 



■PT] 



9e~ &T dT. 



fit) ~ constant x t q e 



-01 



(3) 



The prefactor (p + 9)/p is inserted to ensure that f(t) is 
non-defective, that is, integrates to unity As t — > oo, 

f(t)= ip + ^ e ~ et [l + 0(t-i)] [l + 0(e-")]. 

The size of a surviving taxon. We address briefly the 
distribution of the number N of species in a taxon that 
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are living just before a catastrophic extinction event oc- 
curs; equivalently this is asking for the distribution of 
taxon size today, the detail residing in the probability 
density function f(t) for the time since the taxon began. 
The case A < /i, in which a taxon is driven rapidly to 
extinction, is not considered. We shall consider only the 
case fit) = 8e~ et , t > 0. Since relatively simple expres- 
sions for Pr{iV t = n} are available, the direct calculation 
of the distribution of 



Pt{N = n} 



Pr{N t = n}f(t)dt 



becomes possible; the details are equivalent to those in a 
model of live taxa where both species and genera prolif- 
erate [jj and will not be given here. 

In the case A = /i, the distribution of N is reasonably 
rapidly decaying, though its dominant form is subtle: 



Pr{iV = n} 



ir 1/2 {e/\y> ■// ■ ■< 



5/4„-3/4 -2(8/A) L ' 



l/2„l/2 



The stretched exponential behavior is typical of the 
crossover behavior in problems of stochastic processes or 
statistical physics when exponential decay degenerates to 
algebraic decay as a parameter (here [i) passes through 
a critical value (here A). 

If A > fi, we find Pr{N = n} ~ constant xn" 1 ^/^' 1 ), 
so that Pr{iV > n} ~ constant x n~ e ^ x ^^ as n — > oo. 
The mean taxon size is infinite if A > [i + 9. Applications 
of these formulae to real data are given elsewhere |f7j . 

Proliferation between catastrophes. The problem of 
greater palaeobiological interest concerns the number of 
species that ever belong to a taxon. As before, let fit) be 
the waiting-time density for the time T after the emer- 
gence of a taxon to the next global extinction. The case 
fit) — 9e~ et is of most interest, but we carry generality 
when we may. Let the random variable M be the num- 
ber of species in a taxon that exists only between two 
successive catastrophes. With M t the number of species 
that have ever existed up to time t, we have 

Pm = Pr{M = m) = / Pr{M t = m}f(t)dt. 
Jo 

Using Eq. (|^) , the generating function for p m is given by 



m = E p^r = i + (£- i){i + (4) 



m—l 



where 



x(0 



A[. 



,x 2 t _ „xit 



]f(t)dt 



o {x 2 - A£)e X2 * - (xi - X^)e Xlt 



(5) 



We need to determine the asymptotic behavior of x(Q 
near £ = 1. If the function an d its first deriva- 

tive x'iOi respectively, remain finite at £ = 1, then 
the expected value (M) of M and the variance Var{M} 
of M are finite, and we have (M) = 1 + and 



Var{M} = 2 X '(1) + x(l) - x(l) 2 . The function X (£) 
is symmetric under interchange of x\ and X2- We shall 
identify X2 with the root that approaches A as £ — > 1 , and 
x\ with the root that approaches /ias^-> 1. Solving the 
quadratic equation for x\ and x-i exactly and expanding 
the solutions for 1 — £ — > 0, we record for later use that 

x x = A1 -A A1 (l-£)/(A- M )+0([l-e] 2 ), 
x 2 = A + A A1 (l-£)/(A- A1 ) + 0([l-£] 2 ). 

Provided that the integral in Eq. (J^) converges for £ — 1 , 
we find that the expected value of M is 



(M> 



1 



00 A[e< A -^* 



\}f{t)dt 



A 



(6) 



For A < n, the mean is finite for every density fit). The 
degenerate case A = /i can be analysed separately, or by 
taking the limit A — * ji from below inside the integral in 
Eq. (§), giving (M) = 1 + A(T), so (M) diverges in the 
degenerate case A = [i if the mean waiting time (T) for 
catastrophic extinctions is infinite. 

For A > fi, the integral in Eq. (^|) establishes that 
unless f(t) has at least exponential decay, the mean is 
necessarily divergent. If f(t) ~ constant x t r exp(— 6t) 
as t — > oo, then the mean taxon size if finite so long as 
A < /i + 9. Whether it is also finite in the critical case 
A = [i + 6 depends on the value of r. In particular, for the 
exponential density fit) — 9e~ et we find that (M) = oo 
if A > [i + 9, while (M) = 1 + A/(0 + [i - A) if A < [i + 9. 

To analyse the case A > [i, we shall rewrite the integral 
for x(0 m the equivalent form 



x(0 



A{1 



-(x 2 — Xl)t 



}fit)dt 



{x 2 - A£) - (xi - A£)e" 



-(x 2 — x{)t ' 



The exponentials are decaying functions of time, since 
x 2 — xi = X — [i + 2A/i(l - £)/(A -n) + 0([1 - £] 2 ) as 
1 — £ — > 0. Hence to leading order, 



x(0 



A 



]/(*)* 



K(l-£) + e-( A -A') t ' 



where k = A 2 / (A - 
write y = e~( A ~ M )*. 



^i) 2 . In the case /(i) 
we find that 



,-et 



if we 



x(0 



A6» 



0/(A- M )- 



Mcllin transform methods (see, e.g. [EBI o 



@, Ap- 



pendix 2) can be used to extract the asymptotic behavior 
of this integral and so determine the expansion for </>(£) 
near £ = 1. We find that for A > 9 + [i, 



HO = i - 



ttX[k(1 - £)f/( A -^) 
6sm[(Tv6)/(\- n)] 



while </>(£) = 1 - (A/0)(1 - £) ln^-^l ~ + • ■ ■ for 
A = 6 + [i. This asymptotic behavior of </)(£) suggests the 
following behavior olp m as m — > oo: 



constant x m 
constant x m 



-i-e/(A- 
2 In m 



A > 
A = 



[L, 
[L. 
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To derive this rigorously would require either a careful 
argument based around Darboux's Theorem |{is| , or the 
methods of Flajolet and Odlyzko (h|, or Tauberian The- 
orems supplemented by information about the ultimate 
monotonic decay of p m |l8|, ^(|. We obtain the same 
asymptotic behavior for the total number of species that 
ever existed as that found for currently living species: 

„ r „ r ^ -i f constant x m^ 8 ^^^, A > 9 + a 
PH ill > m ~ < 1. . „ 

[ constant x m Inm, A = a + fj,. 

More generally, note that as £ — ► 1, 

A f°° f{t)dt 



X(0 



It can be shown that if f(t) ~ <dt r e~ et (r > -1) then 

m> ~ (A - v) r+2 sin[(7T0)/ (A - /i)] 

for A > + ^i. In the borderline case X = 9 + fi, we find 
A6 



*(0 



{ln^l-O- 1 ]}^ 1 . 



(r + 1)6"-+ 2 

We predict the asymptotic forms 

f constant x m- 1_9/(A_ ' l) (mm) r , A > + 
Pm ~ [ constant x m- 2 (lnm) r+1 , A = 9 + fi. 



There has been a significant prior work on the mod- 
elling of populations subject to disasters in other con- 
texts, with particular emphasis on the time to extinc- 
tion in the process However, the principal conclu- 
sions of the present paper, especially those drawn for 
the properties of extinct taxa, appear to be new. We 
have shown that the competition between characteris- 
tic rates of species proliferation, individual species ex- 
tinction, and large-scale catastrophic extinction is able 
to generate long-tailed distributions of taxon size and 
consequent scaling properties and fractal interpretations 
without the need to assume an underlying fractal model. 
The formalism covers both currently live taxa, and taxa 
destroyed out by previous global catastrophic extinction 
events. Our results are based on a null model for prolifer- 
ations and extinctions. The validity of the model can be 
assessed by comparing the results established with em- 
pirical size distributions for living and fossil taxa (as in 
p|). Models for evolution with an underlying dynam- 
ics have been proposed The null model provides a 
useful benchmark against which the predictions of more 
detailed models may be assessed and its concepts and 
analytical methods may have applications in other areas. 
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